GFTE: Graph-Based Financial Table Extraction
Authors
Abstract
Tabular data is a crucial form of information expression, which can organize data in a standard structure for easy retrieval and comparison. However, in the financial industry and many other fields, tables are often disclosed in unstructured digital files, e.g. Portable Document Format (PDF) files and images, which are difficult to extract directly. In this paper, to facilitate deep learning based table extraction from unstructured digital files, we publish a standard Chinese dataset named FinTab, which contains more than 1,600 financial tables of diverse kinds and their corresponding structure representation in JSON. In addition, we propose a novel graph-based convolutional neural network model named GFTE as a baseline for future comparison. GFTE integrates image features, position features and textual features together for precise edge prediction and reaches overall good results. See https://github.com/Irene323/GFTE.
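The graph formulation mentioned in the abstract can be pictured with a small sketch: text boxes become nodes, nearby boxes are connected, and each edge is labelled by whether its two cells share a row or a column. This is only an illustration, not the GFTE implementation. The cell data, the k-nearest-neighbour graph construction, and the function names `knn_edges`, `edge_features` and `edge_labels` are all assumptions made for the example; it uses position information only and involves no image features, textual features, or learned model.

```python
import math

# Hypothetical text boxes: (text, x_center, y_center, row, col).
# The row/col indices stand in for ground-truth structure such as a
# JSON annotation might provide; all values here are made up.
cells = [
    ("Item", 10.0, 10.0, 0, 0),
    ("2019", 60.0, 10.0, 0, 1),
    ("Revenue", 10.0, 30.0, 1, 0),
    ("1,200", 60.0, 30.0, 1, 1),
]


def knn_edges(cells, k=2):
    """Connect each cell to its k nearest neighbours by centre distance."""
    edges = set()
    for i, (_, xi, yi, _, _) in enumerate(cells):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (_, xj, yj, _, _) in enumerate(cells)
            if j != i
        )
        for _, j in dists[:k]:
            # Store undirected edges with the smaller index first.
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def edge_features(cells, i, j):
    """Position-only edge feature: absolute horizontal and vertical offsets."""
    _, xi, yi, _, _ = cells[i]
    _, xj, yj, _, _ = cells[j]
    return (abs(xi - xj), abs(yi - yj))


def edge_labels(cells, i, j):
    """Targets an edge classifier would be trained to predict."""
    return {
        "same_row": cells[i][3] == cells[j][3],
        "same_col": cells[i][4] == cells[j][4],
    }


for i, j in knn_edges(cells):
    print(cells[i][0], cells[j][0], edge_features(cells, i, j), edge_labels(cells, i, j))
```

In a learned model, the hand-written labels above would be the training targets, and the edge features would be replaced by representations combining whatever per-node features the model uses.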
Similar resources
Pattern-Based Approach to Table Extraction
In this paper, we address a client-driven approach to automatically extract information content within the table in document images. We start with a graph-based representation of a set of key-fields selected by clients and perform graph mining in a document in order to learn them to produce a model. Such models are aimed to use to extract information content in the absence of clients. To avoid ...
Graph-based Event Extraction from Twitter
Detecting which tweets describe a specific event and clustering them is one of the main challenging tasks related to Social Media currently addressed in the NLP community. Existing approaches have mainly focused on detecting spikes in clusters around specific keywords or Named Entities (NE). However, one of the main drawbacks of such approaches is the difficulty in understanding when the same k...
Unsupervised Graph Based Video Object Extraction
A method to extract the object containing regions in the ‘object proposal’ set in the video is employed. The segmented object containing areas are then employed to construct segmentation models for optimal video object extraction. First, an unsupervised graph based framework is used for detection and extraction of foreground in the video. We take into account the general properties (spatially c...
Graph Grammar Based Web Data Extraction
Web data extraction becomes a hot topic after the invention of World Wide Web, because the large amount of information on the Web makes it challenging to retrieve useful information. Due to the diverse designs and presentations of information on different Web sites, it is hard to implement a general solution to extract data across different Web sites. This paper presents a novel method based on...
Graph-Based Spatio-temporal Region Extraction
Motion-based segmentation is traditionally used for video object extraction. Objects are detected as groups of significant moving regions and tracked through the sequence. However, this approach presents difficulties for video shots that contain both static and dynamic moments, and detection is prone to fail in absence of motion. In addition, retrieval of static contents is needed for high-leve...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-68790-8_50